#speech recognition

1 week ago

Google quietly releases an offline-first AI dictation app on iOS | TechCrunch

Google released an offline-first dictation app called Google AI Edge Eloquent for iOS, featuring advanced speech recognition and text editing capabilities.

from The Register

1 week ago

Microsoft shivs OpenAI with new AI models for speech, images

Microsoft launched public preview versions of machine learning models for speech recognition, speech synthesis, and image generation, competing directly with OpenAI.

#ai

from www.businessinsider.com

The AI tech my dad helped pioneer is now the foundation for the tools I build at AT&T

Natalie Gilbert's work in AI is rooted in her father's foundational research in speech recognition at AT&T's Bell Labs.

Mobile UX

from Ars Technica

The debut of Gemini 3.1 Flash Live could make it harder to know if you're talking to a robot

Gemini 3.1 Flash Live enhances audio interaction, mimicking human speech with AI flags for authenticity detection.

from Fast Company

3 months ago

Startup companies

This AI startup is extending an olive branch between humans and machines

Artificial intelligence

How to Use AI in Video Calls | ClickUp

from Inside Higher Ed | Higher Education News, Events and Jobs

Artificial intelligence

Mistral releases Voxtral, its first open source AI audio model | TechCrunch

9 months ago

Artificial intelligence

Howard and Google Aim to Improve AI Tech for Black Users

Cohere launches an open-source voice model specifically for transcription | TechCrunch

Cohere's Transcribe model is designed for tasks like note-taking and speech analysis, supporting 14 languages and optimized for consumer-grade GPUs, making it accessible for self-hosting.

European startups

Typography

from Mail Online

1 month ago

The UK's hardest accents to understand - with Essex at top of the list

The Essex accent is the most difficult for automated speech-to-text systems to understand, while the Mancunian accent is the easiest.

Business

from Entrepreneur

1 month ago

Grow Your Global Business Reach: Learn a New Language With Rosetta Stone

Mastering new languages with Rosetta Stone's lifetime subscription builds cross-cultural trust, strengthens partnerships, and offers immersive speech-recognition training across 25 languages.

from HubSpot

3 years ago

Voice search optimization: How to get your business heard about

From smartphones to smart speakers and smart TVs, conducting web searches with our voices is common. In many cases, it's even faster, more convenient, and easier than typing in a query. That's likely why the global speech and voice recognition market is projected to grow from $9.66 billion in 2025 to $23.11 billion by 2030. Here's the catch, though: Voice search isn't the same as a text search.

Marketing tech

from Above the Law

2 months ago

Why Solo And Small Firm Lawyers Should Make Voice Their Choice For AI - Above the Law

Voice-based drafting, powered by modern AI transcription, is faster and increasingly accurate, offering a practical shift from keyboard to voice for solos and small firms.

from App Developer Magazine

Why MedGemma 1.5 matters more than the headlines

MedGemma 1.5 and MedASR provide practical, open tools improving integration of 3D medical imaging, clinical text, and speech for healthcare developers.

Healthcare

2 months ago

Dutch healthcare AI Juvoly acquired by Swedish Tandem Health

Juvoly is being acquired by Tandem Health to scale operations, expand across Europe, and accelerate voice-controlled AI reporting for healthcare with added development and certification capacity.

#speech-recognition

Artificial intelligence

Speechify adds voice typing and voice assistant to its Chrome extension | TechCrunch

from Axios

Artificial intelligence

AI's listening gap is fueling bias in jobs, schools and health care

from InfoQ

Artificial intelligence

Mistral Voxtral is an Open-Weights Competitor to OpenAI Whisper and Other ASR Tools

Artificial intelligence

Mistral launches Voxtral: open-source speech recognition for businesses

Artificial intelligence

SpeechVerse vs. SOTA: Multi-Task Speech Models in Real-World Benchmarks | HackerNoon

Artificial intelligence

Evaluating Multimodal Speech Models Across Diverse Audio Tasks | HackerNoon

5 months ago

AI speech model aiOla Drax outpaces OpenAI & Alibaba

Flow-matching-based generative methods are a class of models that learn a continuous vector field which transforms relatively simple noise distributions into more complex data distributions by following ordinary differential equations. Instead of learning discrete denoising steps, as diffusion models do, they train the flow to match probability paths directly between noise and data.
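
A minimal sketch of the idea in the excerpt above, not aiOla's implementation. It assumes the common straight-line probability path x_t = (1 - t) * x0 + t * x1 between a noise sample x0 and a data sample x1, for which the regression target for the vector field is x1 - x0. The function names (`fm_training_pair`, `euler_transport`) are hypothetical and no neural network is trained; the second function integrates the known conditional field with Euler steps to show how following the ODE carries a noise sample to a data point.

```python
import random


def fm_training_pair(x0, x1, t):
    """Build one flow-matching regression example on a straight-line path.

    x0: noise sample, x1: data sample, t in [0, 1].
    Returns the interpolated point x_t and the target velocity that a
    learned vector field v(x_t, t) would be trained to match.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    target = [b - a for a, b in zip(x0, x1)]
    return x_t, target


def euler_transport(x0, x1, steps=10):
    """Follow dx/dt = (x1 - x) / (1 - t) from t=0 to t=1 with Euler steps.

    This is the conditional vector field of the straight-line path
    (an assumption of this sketch); a trained model would substitute
    its learned field here.
    """
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        x = [xi + dt * (bi - xi) / (1.0 - t) for xi, bi in zip(x, x1)]
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(3)]  # simple noise sample
    data = [2.0, -1.0, 0.5]                          # stand-in "data" point
    print(fm_training_pair(noise, data, 0.5))
    print(euler_transport(noise, data))
```

In contrast to a diffusion sampler, nothing here is a denoising step: the sample simply moves along the vector field, and the number of Euler steps is a free choice at inference time.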

Artificial intelligence

Startup companies

5 months ago

Subtle Computing's voice isolation models help computers understand you in noisy environments | TechCrunch

Subtle Computing builds device-specific voice isolation models that preserve device acoustics to capture clean, personalized speech in noisy environments and outperform generic solutions.

from Search Engine Roundtable

from Fast Company

5 months ago

Inside Microsoft's quest to make Windows 11's AI irresistible

Windows 11 introduces Copilot Voice to enable spoken interactions with AI and spoken responses, continuing decades of Microsoft voice-computing efforts.

6 months ago

Google Voice Search Now Using Speech-to-Retrieval (S2R)

At its core, S2R is a technology that directly interprets and retrieves information from a spoken query without the intermediate, and potentially flawed, step of having to create a perfect text transcript. It represents a fundamental architectural and philosophical shift in how machines process human speech.

Artificial intelligence

from Fortune

6 months ago

I tried the viral AI 'Friend' necklace everyone's talking about, and it's like wearing your senile, anxious grandmother around your neck | Fortune

An always-listening AI necklace marketed for contextual emotional support failed to deliver reliable, timely, or truly contextual help during an emotional crisis.

Voice Recognition vs Speech Recognition: What You Need to Know

You've probably used both technologies this week without realizing it. When Siri transcribes your text message, that's speech recognition. When your banking app verifies it's you speaking, that's voice recognition. The terms are often used interchangeably, but they address completely different problems. And as artificial intelligence gets better at faking human speech, understanding voice recognition vs. speech recognition becomes critical for anyone building secure systems.

Artificial intelligence

Gadgets

from Design Milk

Timekettle W4 AI Interpreter Earbuds Streamline Translation

Timekettle's W4 AI Interpreter Earbuds provide near-instant, AI-powered multilingual translation with Bone-voiceprint sensors, dual-voice pickup, noise filtering, and 98% accuracy across 42 languages.

Education

from Entrepreneur

Use Rosetta Stone to Impress Clients Around the World with Fluent, Natural Speech | Entrepreneur

Lifetime Rosetta Stone access to 25 languages with speech-recognition and immersive lessons is available for new users for $148.97 using code FLUENT until September 7.

from The Register

Transcription app Otter.ai accused of illegal recordings

"Otter tries to shift responsibility, outsourcing its legal obligations to its accountholders, rather than seeking permission and consent from the individuals Otter records, as required by law."

Privacy professionals

Whisper vs. Google Speech-to-Text: Which One Should You Use?

Whisper excels in multilingual transcription, supporting a variety of languages and offering consistent accuracy, making it suitable for global applications and media projects.

Artificial intelligence

Online marketing

10 Best Whisper AI Alternatives for Transcription in 2025 | ClickUp

Whisper AI has limitations in real-time features and collaboration.

#ai-technology

from www.bbc.com

Artificial intelligence

New AI voice tool trained to copy British regional accents

10 months ago

Women in technology

Wispr Flow releases iOS app in a bid to make dictation feel effortless | TechCrunch

The Mother of Communication

Babies begin language learning in the womb by recognizing their mother's voice and speech patterns.